Font adaptation of an HMM-based OCR system
Internal identifier: 000781 (Main/Exploration); previous: 000780; next: 000782
Authors: Kamel Ait-Mohand [France]; Laurent Heutte [France]; Thierry Paquet [France]; Nicolas Ragot [France]
Source:
- Proceedings of SPIE, the International Society for Optical Engineering [0277-786X]; 2010.
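French descriptors
- Pascal (Inist): Algorithme, Reconnaissance forme, Recherche documentaire, Modèle Markov variable cachée, Reconnaissance optique caractère, Evaluation performance, Précision, Approche probabiliste, 0130C, 4230S
- Wicri:
- topic: Recherche documentaire.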
English descriptors
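- KwdEn: Accuracy, Algorithms, Document retrieval, Hidden Markov models, Optical character recognition, Pattern recognition, Performance evaluation, Probabilistic approach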
Abstract
We build a polyfont OCR recognizer from character-level hidden Markov models (HMMs) trained on a dataset covering a variety of fonts. We compare this system with monofont recognizers and show that its performance drops when it is used to recognize unseen fonts. To close this performance gap, we adapt the model parameters of the polyfont recognizer to a new dataset of unseen fonts using four different adaptation algorithms. Our experiments show that the adapted system is far more accurate than the initial system, although it does not reach the accuracy of a monofont recognizer.
Affiliations:
- France
- Centre-Val de Loire, Haute-Normandie, Région Centre, Région Normandie
- Saint-Etienne-du-Rouvray, Tours
- Université de Rouen
Links to previous steps (curation, corpus...)
- to stream PascalFrancis, to step Corpus: 000160
- to stream PascalFrancis, to step Curation: 000617
- to stream PascalFrancis, to step Checkpoint: 000155
- to stream Main, to step Merge: 000786
- to stream Main, to step Curation: 000781
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Font adaptation of an HMM-based OCR system</title>
<author><name sortKey="Ait Mohand, Kamel" sort="Ait Mohand, Kamel" uniqKey="Ait Mohand K" first="Kamel" last="Ait-Mohand">Kamel Ait-Mohand</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Université de Rouen, LITIS EA 4108, Avenue de l'Université - BP 8</s1>
<s2>76801 Saint-Etienne-du-Rouvray</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName><region type="region" nuts="2">Région Normandie</region>
<region type="old region" nuts="2">Haute-Normandie</region>
<settlement type="city">Saint-Etienne-du-Rouvray</settlement>
</placeName>
<orgName type="university">Université de Rouen</orgName>
</affiliation>
</author>
<author><name sortKey="Heutte, Laurent" sort="Heutte, Laurent" uniqKey="Heutte L" first="Laurent" last="Heutte">Laurent Heutte</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Université de Rouen, LITIS EA 4108, Avenue de l'Université - BP 8</s1>
<s2>76801 Saint-Etienne-du-Rouvray</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName><region type="region" nuts="2">Région Normandie</region>
<region type="old region" nuts="2">Haute-Normandie</region>
<settlement type="city">Saint-Etienne-du-Rouvray</settlement>
</placeName>
<orgName type="university">Université de Rouen</orgName>
</affiliation>
</author>
<author><name sortKey="Paquet, Thierry" sort="Paquet, Thierry" uniqKey="Paquet T" first="Thierry" last="Paquet">Thierry Paquet</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Université de Rouen, LITIS EA 4108, Avenue de l'Université - BP 8</s1>
<s2>76801 Saint-Etienne-du-Rouvray</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName><region type="region" nuts="2">Région Normandie</region>
<region type="old region" nuts="2">Haute-Normandie</region>
<settlement type="city">Saint-Etienne-du-Rouvray</settlement>
</placeName>
<orgName type="university">Université de Rouen</orgName>
</affiliation>
</author>
<author><name sortKey="Ragot, Nicolas" sort="Ragot, Nicolas" uniqKey="Ragot N" first="Nicolas" last="Ragot">Nicolas Ragot</name>
<affiliation wicri:level="3"><inist:fA14 i1="02"><s1>Université François Rabelais Tours, LI EA 2101, 64 avenue Jean Portalis</s1>
<s2>37200 Tours</s2>
<s3>FRA</s3>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName><region type="region" nuts="2">Centre-Val de Loire</region>
<region type="old region" nuts="2">Région Centre</region>
<settlement type="city">Tours</settlement>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">10-0429723</idno>
<date when="2010">2010</date>
<idno type="stanalyst">PASCAL 10-0429723 INIST</idno>
<idno type="RBID">Pascal:10-0429723</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000160</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000617</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000155</idno>
<idno type="wicri:doubleKey">0277-786X:2010:Ait Mohand K:font:adaptation:of</idno>
<idno type="wicri:Area/Main/Merge">000786</idno>
<idno type="wicri:Area/Main/Curation">000781</idno>
<idno type="wicri:Area/Main/Exploration">000781</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Font adaptation of an HMM-based OCR system</title>
<author><name sortKey="Ait Mohand, Kamel" sort="Ait Mohand, Kamel" uniqKey="Ait Mohand K" first="Kamel" last="Ait-Mohand">Kamel Ait-Mohand</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Université de Rouen, LITIS EA 4108, Avenue de l'Université - BP 8</s1>
<s2>76801 Saint-Etienne-du-Rouvray</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName><region type="region" nuts="2">Région Normandie</region>
<region type="old region" nuts="2">Haute-Normandie</region>
<settlement type="city">Saint-Etienne-du-Rouvray</settlement>
</placeName>
<orgName type="university">Université de Rouen</orgName>
</affiliation>
</author>
<author><name sortKey="Heutte, Laurent" sort="Heutte, Laurent" uniqKey="Heutte L" first="Laurent" last="Heutte">Laurent Heutte</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Université de Rouen, LITIS EA 4108, Avenue de l'Université - BP 8</s1>
<s2>76801 Saint-Etienne-du-Rouvray</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName><region type="region" nuts="2">Région Normandie</region>
<region type="old region" nuts="2">Haute-Normandie</region>
<settlement type="city">Saint-Etienne-du-Rouvray</settlement>
</placeName>
<orgName type="university">Université de Rouen</orgName>
</affiliation>
</author>
<author><name sortKey="Paquet, Thierry" sort="Paquet, Thierry" uniqKey="Paquet T" first="Thierry" last="Paquet">Thierry Paquet</name>
<affiliation wicri:level="4"><inist:fA14 i1="01"><s1>Université de Rouen, LITIS EA 4108, Avenue de l'Université - BP 8</s1>
<s2>76801 Saint-Etienne-du-Rouvray</s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
<sZ>3 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName><region type="region" nuts="2">Région Normandie</region>
<region type="old region" nuts="2">Haute-Normandie</region>
<settlement type="city">Saint-Etienne-du-Rouvray</settlement>
</placeName>
<orgName type="university">Université de Rouen</orgName>
</affiliation>
</author>
<author><name sortKey="Ragot, Nicolas" sort="Ragot, Nicolas" uniqKey="Ragot N" first="Nicolas" last="Ragot">Nicolas Ragot</name>
<affiliation wicri:level="3"><inist:fA14 i1="02"><s1>Université François Rabelais Tours, LI EA 2101, 64 avenue Jean Portalis</s1>
<s2>37200 Tours</s2>
<s3>FRA</s3>
<sZ>4 aut.</sZ>
</inist:fA14>
<country>France</country>
<placeName><region type="region" nuts="2">Centre-Val de Loire</region>
<region type="old region" nuts="2">Région Centre</region>
<settlement type="city">Tours</settlement>
</placeName>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">Proceedings of SPIE, the International Society for Optical Engineering</title>
<title level="j" type="abbreviated">Proc. SPIE Int. Soc. Opt. Eng.</title>
<idno type="ISSN">0277-786X</idno>
<imprint><date when="2010">2010</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">Proceedings of SPIE, the International Society for Optical Engineering</title>
<title level="j" type="abbreviated">Proc. SPIE Int. Soc. Opt. Eng.</title>
<idno type="ISSN">0277-786X</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Accuracy</term>
<term>Algorithms</term>
<term>Document retrieval</term>
<term>Hidden Markov models</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Performance evaluation</term>
<term>Probabilistic approach</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Algorithme</term>
<term>Reconnaissance forme</term>
<term>Recherche documentaire</term>
<term>Modèle Markov variable cachée</term>
<term>Reconnaissance optique caractère</term>
<term>Evaluation performance</term>
<term>Précision</term>
<term>Approche probabiliste</term>
<term>0130C</term>
<term>4230S</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Recherche documentaire</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">We build a polyfont OCR recognizer from character-level hidden Markov models (HMMs) trained on a dataset covering a variety of fonts. We compare this system with monofont recognizers and show that its performance drops when it is used to recognize unseen fonts. To close this performance gap, we adapt the model parameters of the polyfont recognizer to a new dataset of unseen fonts using four different adaptation algorithms. Our experiments show that the adapted system is far more accurate than the initial system, although it does not reach the accuracy of a monofont recognizer.</div>
</front>
</TEI>
<affiliations><list><country><li>France</li>
</country>
<region><li>Centre-Val de Loire</li>
<li>Haute-Normandie</li>
<li>Région Centre</li>
<li>Région Normandie</li>
</region>
<settlement><li>Saint-Etienne-du-Rouvray</li>
<li>Tours</li>
</settlement>
<orgName><li>Université de Rouen</li>
</orgName>
</list>
<tree><country name="France"><region name="Région Normandie"><name sortKey="Ait Mohand, Kamel" sort="Ait Mohand, Kamel" uniqKey="Ait Mohand K" first="Kamel" last="Ait-Mohand">Kamel Ait-Mohand</name>
</region>
<name sortKey="Heutte, Laurent" sort="Heutte, Laurent" uniqKey="Heutte L" first="Laurent" last="Heutte">Laurent Heutte</name>
<name sortKey="Paquet, Thierry" sort="Paquet, Thierry" uniqKey="Paquet T" first="Thierry" last="Paquet">Thierry Paquet</name>
<name sortKey="Ragot, Nicolas" sort="Ragot, Nicolas" uniqKey="Ragot N" first="Nicolas" last="Ragot">Nicolas Ragot</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000781 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000781 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= Pascal:10-0429723 |texte= Font adaptation of an HMM-based OCR system }}
This area was generated with Dilib version V0.6.32.